Traced fast fourier transform apparatus and method

ABSTRACT

A Fast Fourier Transform (FFT) arrangement for use in those situations in which not all of the outputs are desired is controlled in such a fashion that at least those multiplications (and possibly those additions) are not performed which do not contribute toward the desired outputs. The technique is usable in those situations in which the desired output signals are noncontiguous, or are in noncontiguous bins. The technique includes signal preprocessing in which the indices are adjusted so that the index for a particular stage points to those butterflies of the previous stage which contribute toward its output. The FFT is performed on the indexed data. In one embodiment, a pipelined FFT processor is controlled in a corresponding manner.

CLAIM OF PRIORITY

This application claims priority of provisional application 60/196,028, filed Apr. 7, 2000.

FIELD OF THE INVENTION

This invention relates to the FFT, and more particularly to techniques for such FFT which eliminate the need to perform at least certain multiplications and/or additions, when those multiplications and/or additions are not necessary to the generation of particular wanted ones of the set of outputs.

BACKGROUND OF THE INVENTION

The Fast Fourier Transform (FFT) is widely used for generation of spectrum information from sets of data which vary in time (spectral analysis), and in the reverse direction for determining the time function which is equivalent to a particular spectrum (filtering). The FFT is widely used in communications applications such as demultiplexing, and is generally described by Brigham in “The Fast Fourier Transform and its Applications”, Prentice-Hall, 1988. In its demultiplexing role, the algorithm is used in multicarrier demultiplexing, as described by Crochiere et al. in “Multirate Digital Signal Processing”, Prentice-Hall, 1983. The FFT can be implemented as a decimation-in-time (DIT) function or as a decimation-in-frequency (DIF) function or algorithm, and can be implemented in hardware, in software running on a general-purpose processor, or as a pipelined structure adapted for the application. The pipelined structure is very advantageous for many applications.

FIG. 1 is an illustration of the transposed canonical signal flow graph 10 which is ordinarily used to explain the operation of the FFT. In FIG. 1, the flow graph includes a plurality 2^(S) of input nodes or points at the left of the FIGURE, where S is an integer representing the number of FFT stages. The value of 2^(S) is sixteen in the particular example of FIG. 1, and these input nodes or points are numbered 0 to 15. There are similarly sixteen output ports at the right, numbered in like fashion. Lying between the input and output ports are S stages of butterflies, where S=4 in this example. The first stage (stage 1) of butterflies has paired inputs and outputs, so that the sixteen input ports are coupled to eight individual butterfly groups. In particular, input nodes 0 and 1 are coupled to a first butterfly, designated 0, of the first butterfly group of stage 1. Similarly, input nodes 2 and 3 are coupled to the butterfly of the second butterfly group of stage 1, input nodes 4 and 5 are coupled to a third butterfly group of stage 1, input nodes 6 and 7 are coupled to a fourth butterfly group of stage 1, and so forth. Each of the butterfly groups in the first stage includes a single butterfly, designated 0 at the crossing of the two lines of the butterfly. The last butterfly group of stage 1 is the eighth butterfly group, which is connected to input nodes 14 and 15. Each butterfly group of stage 1 of FIG. 1 has a pair of output nodes. In FIG. 1, the output nodes of the first butterfly group are designated 0 and 1, the output nodes of the second butterfly group are designated 2 and 3, the output nodes of the third butterfly group are designated 4 and 5, the output nodes of the fourth butterfly group are designated 6 and 7, the output nodes of the fifth butterfly group are designated 8 and 9, the output nodes of the sixth butterfly group are designated 10 and 11, and the output nodes of the seventh butterfly group are designated 12 and 13. The output nodes of the eighth butterfly group are designated 14 and 15.

The output sum (+) and difference (−) signals from each butterfly group of the first stage of FIG. 1 appear at first-stage output nodes 0 through 15, corresponding to the second-stage input nodes, which are grouped into sets of four. Thus, first-stage output ports 0, 1, 2, and 3 correspond to input ports 0, 1, 2, and 3 of the first butterfly group of the second stage of butterflies. First-stage output ports 4, 5, 6, and 7 correspond to second-stage input ports 0, 1, 2, and 3 of the second group of butterflies of the second stage. First-stage output ports 8, 9, 10, and 11 correspond to input ports 0, 1, 2, and 3, respectively, of the third set of butterflies of the second stage. Lastly, output ports 12, 13, 14, and 15 of the first stage of butterflies correspond to input ports 0, 1, 2, and 3 of the fourth set of butterflies of the second stage. The second stage of butterflies is thus seen to be divided into four groups, each containing two butterflies, designated 0 and 1. The first and third inputs of each butterfly group of the second stage share the first butterfly of the group, namely the one designated 0, and the second and fourth inputs share a second butterfly, namely the one designated 1. This is true for each of the four groups or sets of butterflies of the second stage of FIG. 1.

In the arrangement of FIG. 1, the third-stage butterflies are grouped into two sets, each having its input ports numbered from 0 to 7. Thus, output port 0 of the first butterfly group of stage 2 is connected to or corresponds to input port 0 of the first group or set of butterflies of stage 3, output port 1 of the first butterfly group of stage 2 corresponds to input port 1 of the first butterfly group of stage 3, output port 2 of the first butterfly group of stage 2 corresponds to input port 2 of the first butterfly group of stage 3, and output port 3 of the first butterfly group of stage 2 corresponds to input port 3 of the first group of butterflies of the third stage of butterflies. Output port 0 of the second butterfly group of stage 2 corresponds to input port 4 of the first group of butterflies of stage 3, output port 1 of the second butterfly group of stage 2 corresponds to input port 5 of the first butterfly group of stage 3, output port 2 of the second group of butterflies of stage 2 corresponds to input port 6 of the first butterfly group of stage 3, and output port 3 of the second group of butterflies of stage 2 corresponds to input port 7 of the first butterfly group of stage 3. In a similar manner, output ports 8, 9, 10, and 11 of the third butterfly group of stage 2 correspond to input ports 0, 1, 2, and 3, respectively, of the second group of butterflies of the third stage of FIG. 1. Lastly, output ports 12, 13, 14, and 15 of the fourth butterfly group of stage 2 of FIG. 1 correspond to input ports 4, 5, 6, and 7, respectively, of the second group of butterflies of the third stage. Thus, the third stage of butterflies is partitioned into two groups, namely the upper group of four butterflies, designated 0, 1, 2, and 3, associated with, or having, output ports 0 through 7, and the lower group of four butterflies, also designated 0, 1, 2, and 3, having output ports 8 through 15.

In FIG. 1, output ports or nodes 0, 1, 2, 3, 4, 5, 6, and 7 of the first butterfly group of stage 3 correspond to like-numbered input ports of the single butterfly group of stage 4, and output ports 8, 9, 10, 11, 12, 13, 14, and 15 of the second butterfly group of stage 3 correspond to like-numbered input ports of the single butterfly group of stage 4. Thus, the butterfly group of the last or fourth stage of butterflies is in one monolithic group, or in other words is not divided into groups, and its eight individual butterflies are designated 0 through 7. More particularly, butterfly 0 of the fourth-stage butterfly group is associated with output ports 0 and 8, butterfly 1 is associated with output ports 1 and 9, butterfly 2 is associated with output ports 2 and 10, butterfly 3 is associated with output ports 3 and 11, butterfly 4 is associated with output ports 4 and 12, butterfly 5 is associated with output ports 5 and 13, butterfly 6 is associated with output ports 6 and 14, and butterfly 7 is associated with output ports 7 and 15.

FIG. 2 a illustrates a single butterfly representation of a first type, which can apply to any one butterfly or line-crossing of FIG. 1, and FIG. 2 b represents a different type of butterfly, which can also be used in the representation of FIG. 1. In FIG. 2 a, the butterfly is of the type used with a decimation-in-frequency (DIF) FFT operation when applied to FIG. 1. The butterfly of FIG. 2 b is of the type used with a decimation-in-time (DIT) FFT operation when applied to FIG. 1. In the arrangement of FIG. 2 a, the butterfly 220 includes an input node coupled to an input port A, another input node coupled to another input port B, a + output node coupled to an output port C, and a further − output node coupled by way of a weighting or twiddle factor multiplier 222 to output port D. The butterfly 220 of FIG. 2 a can be represented by the symbol designated 226.

In the arrangement of FIG. 2 b, the butterfly 210 includes an input node coupled to input port A, and a second input node coupled to input port B. In addition, the + output node of the butterfly of FIG. 2 b is coupled to an output port C, and the − output node is coupled to output port D. A weighting or twiddle factor multiplier 214 is coupled between input port B and the lower input node 212 of butterfly 210. The butterfly of FIG. 2 b can be represented by the symbol designated 216.

In operation of the flow graph of FIG. 1 for use with a prior-art FFT, the input data points are assumed to have been buffered, and a set of sixteen data points is available for application to the input nodes of the flow graph of FIG. 1. Thus, a particular complex number is applied to each input node 0 through 15.

In general, there are S stages, which number four in the arrangement of FIG. 1. At the i^(th) stage (where 1≦i≦S), there are 2^(S−i) butterfly groups, each group containing 2^(i−1) butterflies. The input ports of the i^(th) stage butterfly group are labelled from 0 to (2^(i)−1). Input port j (0≦j≦2^(i)−1) and input port (j+2^(i−1)) % 2^(i) share the same butterfly.
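The stage structure just described can be checked numerically. The following is a minimal C sketch, assuming one-based stage numbers as in FIG. 1; the function names are hypothetical and are introduced here only for illustration.

```c
#include <assert.h>

/* Number of butterfly groups at stage i (1 <= i <= S): 2^(S-i). */
int groups_in_stage(int S, int i)
{
    assert(i >= 1 && i <= S);
    return 1 << (S - i);
}

/* Number of butterflies in each group at stage i: 2^(i-1). */
int butterflies_per_group(int i)
{
    assert(i >= 1);
    return 1 << (i - 1);
}

/* Port that shares a butterfly with port j inside a stage-i group,
 * whose ports are labelled 0 .. 2^i - 1: (j + 2^(i-1)) % 2^i. */
int partner_port(int i, int j)
{
    int span = 1 << i;              /* ports per group at stage i */
    assert(j >= 0 && j < span);
    return (j + span / 2) % span;
}
```

For the sixteen-point example (S=4), stage 1 has eight groups of one butterfly each and stage 4 has one group of eight butterflies; in stage 4, port 3 shares its butterfly with port 11.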

The operation of an FFT can be implemented in software. An illustrative example of prior-art software for performing an FFT, in C language, is

    #include <complex.h>
    #include <math.h>

    void FFT(int N, int s, int **indexSet, float complex *x)
    {
        int i, j, j2, k;        // counters
        int nRep;               // index spacing between adjacent butterfly groups
        int numBFL;             // number of butterflies per group
        float twoPi;
        float ang;              // twiddle factor unit phase
        float TWF;              // twiddle factor phase
        float c, sn;            // cosine and sine storage variables
                                // (sn is used because s is already a parameter)
        float complex tempData;

        nRep = 1;
        twoPi = 2 * 3.14159265;
        for (i = 0; i < s; i++)             // number-of-stages loop
        {
            numBFL = nRep;                  // number of butterflies per group at stage i
            nRep = 2 * nRep;
            ang = twoPi / nRep;
            for (j = 0; j < numBFL; j++)
            {
                // calculate the twiddle factors
                TWF = ang * j;
                c = cos(TWF);
                sn = sin(TWF);
                // update the data
                for (k = j; k < N; k += nRep)
                {
                    j2 = k + numBFL;
                    tempData = x[j2] * (c + I * sn);
                    x[j2] = x[k] - tempData;
                    x[k] = x[k] + tempData;
                }
            }
        }
    }

The underlined portions of the prior-art FFT processing are those in which changes are made to implement the method of the invention, as described below.

In general, the entire FFT is calculated in the prior art, even if only a few of the output points are required. There are applications in which the required FFT outputs are sparse, as for example in which the desired outputs correspond to only certain bins of the FFT output or in narrow frequency windows. In the multicarrier demodulation context, it might be desired to extract only one or a few noncontiguous carriers from the multicarrier input signal. FFT pruning is known for reducing the computational burden. Such pruning is described by Markel in “FFT pruning”, published at pp 305-311 in the IEEE Transactions on Audio Electroacoustics, Vol. Au-19, December 1971; Skinner in “Pruning the decimation-in-time FFT algorithm,” published at pp 193-194 of IEEE Trans. Acoustics, Speech, and Signal Processing, vol ASSP-24, April 1976; Sreenivas et al., in “FFT algorithms for both input and output pruning,” published at pp 291-292 of IEEE Trans. Acoustics, Speech, and Signal Processing, vol ASSP-27, June 1979; Sreenivas et al., in “High-resolution Narrow-Band Spectra by FFT pruning,” published at pp 254-257 of IEEE Trans. Acoustics, Speech, and Signal Processing, vol ASSP-28, April 1980; and Nagai, in “Pruning the decimation-in-time FFT Algorithm with frequency shift,” at pp 1008-1010 of IEEE Trans. Acoustics, Speech, and Signal Processing, vol ASSP-34, August 1986. However, the pruning described in these sources appears to be applied only when the set of output points or bins is continuous, which is to say when the outputs are in continuous windows, or to require special structures. It should be noted that, since the FFT is cyclic, outputs are (or can be considered to be) continuous when they extend from the highest-numbered back to zero. In our example, that is to say that the output node or port group numbered 14, 15, 0, 1 is a continuous or contiguous group, since the transition between nodes numbered 15 and 0 is not considered to be discontinuous. On the other hand, the output node group 14, 0, 1 would be considered to be discontinuous, since a non-selected port (port 15) lies within the sequence.
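The notion of cyclic contiguity used above can be made concrete with a short test. The following is a minimal C sketch, assuming the selected bins are given as an array of flags; the function name is hypothetical. It counts the places around the circle of N bins at which a selected bin follows a non-selected one; a single run has at most one such place.

```c
#include <assert.h>
#include <stdbool.h>

/* Returns true if the selected bins form at most one run when the
 * n bins are arranged on a circle (bin n-1 is adjacent to bin 0). */
bool is_cyclically_contiguous(const bool *selected, int n)
{
    int run_starts = 0;
    assert(n > 0);
    for (int k = 0; k < n; k++) {
        bool prev = selected[(k + n - 1) % n];   /* cyclic predecessor */
        if (selected[k] && !prev)
            run_starts++;                        /* a new run begins at k */
    }
    return run_starts <= 1;
}
```

With sixteen bins, the group 14, 15, 0, 1 yields one run and is reported as contiguous, while the group 14, 0, 1 yields two runs and is reported as discontinuous.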

Improved pruning techniques for FFT are desired.

SUMMARY OF THE INVENTION

A method according to an aspect of the invention is for performing a fast Fourier transform on a digital series to produce signals in cyclically noncontinuous output bins, by radix 2 FFT. The method comprises the step of determining the required outputs from such factors as the number 2^(S) of FFT points, the output bin index O_(S), and the input signal array. The butterfly index for the last stage (stage S) of the transposed canonical flow graph is determined by

$\begin{matrix}{\Psi_{S - 1} = {O_{S}\;\%\left( \frac{N}{2} \right)}} & (1)\end{matrix}$

where Ψ_(S−1) represents the butterfly index for stage S. The butterfly index is, for example, represented by the numbers at the crossing point or crossings of butterflies of stage S=4 in FIG. 3. Following determination of the butterfly index for the last stage, the butterfly indices for all other stages are determined by

$\begin{matrix}{\Psi_{l - 1} = {\Psi_{l}\mspace{11mu}\%\left( \frac{N}{2^{S - l + 1}} \right)}} & (2)\end{matrix}$

where:

-   Ψ_(l−1) represents the butterfly index for stage l (l≠S); and
-   l varies from 1 to (S−1).

The butterfly indices so determined are sorted or placed in ascending order if not already in ascending order. Finally, using the butterfly index, only those butterflies necessary for calculation of the output bins are calculated.

In a particular mode of the method of the invention suited for use with a pipelined FFT implementation, the step of calculating only those butterflies necessary for calculation of the output bins is performed by steps including setting the (j+1)^(th) butterfly index set Ψ_(j), where (1≦j≦S−1), and mapping from the (j+1)^(th) stage butterfly index set Ψ_(j) to the j^(th) stage memory bits m_(j) ^(i) (1≦j≦S−1, 0≦i≦2^(j)−1), by

-   (a) for (1≦j≦S−1) (memory bits for stage j):
    -   (i) if (k∈Ψ_(j)), or Ψ_(j) contains index k, where k is the upper index for the memory bit representation, then setting m_(j) ^(k)=1, and
    -   (ii) if (k∉Ψ_(j)), then setting m_(j) ^(k)=0;
-   (b) for j=S (memory bits for stage S or the last stage):
    -   (i) if (k∈O_(S)), or O_(S) contains index k, then setting m_(j) ^(k)=1, and
    -   (ii) if (k∉O_(S)), then setting m_(j) ^(k)=0; and

controlling the operation of the j^(th) stage of the pipelined FFT by control of the memory pair m_(j) ^(i) (0≦i≦2^(j−1)−1) and m_(j) ^(i+Y), (Y=2^(j−1)).

In one mode, the step of setting the butterfly index includes the steps, when 0≦i≦(2^(j−1)−1), of:

-   -   controlling the active/sleep mode of the butterfly adder with        m_(j) ^(i);    -   controlling the active/sleep mode of the butterfly subtractor        with m_(j) ^(i+Y); and    -   controlling the active/sleep mode of the butterfly multiplier in        accordance with the Boolean OR of m_(j) ^(i) and m_(j) ^(i+Y).
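The mapping from an index set to memory bits, and from a memory-bit pair to the active/sleep controls of one butterfly, can be sketched in C as follows. This is an illustration under stated assumptions: the structure `ButterflyControl` and both function names are hypothetical, and the memory bits of one stage are held in a plain array of flags rather than in hardware memory.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Active (true) or sleep (false) state of the three butterfly elements. */
typedef struct {
    bool adder_on;
    bool subtractor_on;
    bool multiplier_on;
} ButterflyControl;

/* Set bits[k] = 1 for each k in the index set, and 0 otherwise. */
void set_memory_bits(bool *bits, size_t n_bits,
                     const int *index_set, size_t n_indices)
{
    for (size_t k = 0; k < n_bits; k++)
        bits[k] = false;
    for (size_t w = 0; w < n_indices; w++) {
        assert(index_set[w] >= 0 && (size_t)index_set[w] < n_bits);
        bits[index_set[w]] = true;
    }
}

/* Controls for butterfly i of stage j (j counted from 1), given the
 * 2^j memory bits of that stage; Y = 2^(j-1). */
ButterflyControl butterfly_control(const bool *bits, int j, int i)
{
    int y = 1 << (j - 1);
    ButterflyControl c;
    assert(i >= 0 && i < y);
    c.adder_on = bits[i];                            /* m_j^i     */
    c.subtractor_on = bits[i + y];                   /* m_j^(i+Y) */
    c.multiplier_on = c.adder_on || c.subtractor_on; /* Boolean OR */
    return c;
}
```

As an example, if the index set mapped onto the eight memory bits of stage 3 is {3, 6}, butterfly 3 of that stage has its adder and multiplier active and its subtractor asleep, butterfly 2 has its subtractor and multiplier active and its adder asleep, and butterflies 0 and 1 are entirely asleep.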

BRIEF DESCRIPTION OF THE DRAWING

FIG. 1 is a simplified transpose canonical flow diagram of a prior-art arrangement for producing a fast Fourier transform;

FIGS. 2 a and 2 b are simplified block diagrams illustrating two types of butterflies which may be used in the arrangement of FIG. 1, and also illustrating symbols therefor;

FIG. 3 is a simplified transpose canonical signal flow graph or chart for explaining preprocessing according to an aspect of the invention;

FIG. 4 is a simplified conventional logic flow chart or graph illustrating the overall logic flow according to the invention for producing an FFT output by steps including preprocessing and traced FFT processing;

FIG. 5 is a simplified conventional logic flow chart or graph illustrating the logic flow for the preprocessing of FIG. 4;

FIG. 6 is a simplified conventional logic flow chart or graph illustrating the logic flow for the generation of the FFT output by the traced FFT method of FIG. 4;

FIG. 7 is a simplified block diagram of a pipeline processor controlled according to an aspect of the invention for performing only those butterflies required for generating specified sparse outputs;

FIGS. 8 a through 8 o represent various signals which appear in the structure of FIG. 7 during representative operation;

FIGS. 9 a and 9 b are simplified block diagrams of controllers or control systems for generating processor control signals for DIT and DIF pipelined FFT processing, respectively; and

FIGS. 10 and 11 are illustrations of how the i^(th) stage of butterfly is controlled in the DIT and DIF processors of FIGS. 9 a and 9 b, respectively.

DESCRIPTION OF THE INVENTION

FIG. 3 is a simplified transpose canonical signal flow graph or diagram useful in explaining the preprocessing to reindex the butterflies for each stage in accordance with an aspect of the invention. It should be noted that, even though the flow graph of FIG. 3 is very similar to that of FIG. 1, it is used to describe preprocessing, rather than the operation of the FFT derivation. In FIG. 3, the fourth-stage outputs are designated 0 through 15, just as in FIG. 1. However, the application in this example requires only two output points, namely points 3 and 6. According to an aspect of the invention, the processing is modified in such a manner that only those multiplications associated with those butterflies which take part in producing the desired fourth-stage outputs on ports 3 and 6 are performed. Additions and subtractions require very small amounts of processing power. Ideally, the additions and subtractions associated with such non-used signals would also be eliminated. In accordance with another aspect of the invention, some of the ports or nodes of some of the stages of the structure of FIG. 3 are redesignated by comparison with the designations of FIG. 1. Also, some of the paths are shown as dotted lines, while other paths are solid lines. In particular, the output nodes of stage 1 of the butterfly array of FIG. 3 are renumbered from 0 through 15 to a sequence 0, 1, 0, 1 . . . 0, 1. Also, the output ports of the second stage of butterflies are renumbered from 0 through 15 as in FIG. 1 to the array 0, 1, 2, 3, 0, 1, 2, 3, . . . 0, 1, 2, 3. The output ports designated 0 through 15 of the third stage of butterflies of FIG. 1 are renumbered to 0 through 7, 0 through 7. The purpose of the redesignation of the nodes or ports is to permit the program which processes the data to identify the paths which, when traced back, identify those nodes and butterflies which contribute toward the desired output signals. For example, one of the two selected output signals in the arrangement of FIG. 3 is that signal at output 3, designated by a large dot. Output 3 is in the same butterfly as output 11, which is not selected. Output node 3 of the fourth stage of butterflies connects by a solid line to output port 3 of the third stage of butterflies of both the upper and lower butterfly groups of stage 3. Similarly, output node 6 of the fourth stage of butterflies of FIG. 3 is identified by a large dot, and is connected by solid lines to output nodes 6 of both the upper and lower butterfly groups of stage 3. The advantage of the redesignation becomes apparent, in that the fourth-stage butterflies contributing to the desired outputs can be determined from the third-stage output node index.

Continuing with FIG. 3, those butterflies of the second stage of butterflies contributing toward the outputs 3 and 6 of the upper and lower butterfly sets of the third stage are identified by the same indices. More particularly, output node 3 of the uppermost butterfly set of the third stage is connected by solid lines to output port 3 of the uppermost butterfly set of the second stage of butterfly sets, and to output node 3 of the second butterfly set of the second stage. Similarly, output node 3 of the lowermost one of the butterfly sets of stage 3 is connected by solid lines to output nodes 3 of the third and fourth butterfly sets of stage 2. A similar examination reveals that output nodes 6 of the upper and lower butterfly sets of stage 3 of the structure of FIG. 3 are connected by solid lines to output nodes 2 of the four butterfly sets of stage 2. In this case, the “2” index can be determined as 6 modulo 4. In a very similar manner, using the calculation of 2 modulo 2=0, the “2” designated output ports of the second stage of butterflies are connected by solid lines to the “0” designated ports of the first stage. Using the calculation of 3 modulo 2=1, the “3” designated output ports or nodes of the second stage of butterfly sets are connected to the “1” output ports of the butterflies of the first stage. It will be noted that a large dot appears at each of the output ports of the butterfly groups of the first stage of FIG. 3. This means that all the outputs of the first stage of butterflies are used; however, in the remaining stages, less than all of the butterflies are used to generate the desired sparse results.

It should be noted that, in each stage of the structure of FIG. 3, the index identifying the butterfly which must execute can be determined from the desired-output index itself, counted modulo a power of two. More particularly, at each stage, the butterfly index is the desired-output index counted modulo 2^(i−1), where i is the stage number. Thus, for the example of FIG. 3, in which 3 and 6 were selected as the desired outputs from the last stage, the butterflies of the output stage 4 which contribute toward the desired output signals are 3 modulo (2³=8), and 6 modulo 8, corresponding to 3 and 6, respectively. This identifies those butterflies designated 3 and 6 in the output stage as contributing toward generating the desired signals. The remaining butterflies 0, 1, 2, 4, 5, and 7 of the output stage do not contribute toward the desired outputs. In the penultimate stage (stage 3), the output-stage indices 3 and 6, counted modulo 4, give new butterfly indices 3 and 2, respectively. Thus, only butterflies 3 and 2 in the upper and lower butterfly sets or groups of stage 3 need to execute, and all the others may remain quiescent. In the antepenultimate stage, namely stage 2, the indices can be determined by output-stage indices 3 and 6, counted modulo 2, which correspond to 1 and 0, respectively. Thus, the butterflies required to execute in the second stage are those designated 0 and 1. In the first stage, the indices can be determined by output-stage indices 3 and 6 counted modulo 2⁰=1, which generates 0 for all the output indices. Thus, all the butterflies of the first stage are required to execute. This completes the preprocessing of the signals in accordance with an aspect of the invention.
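The modulo counting of this worked example can be reproduced with a few lines of C. This is a minimal sketch, assuming one-based stage numbers; the function name is hypothetical. Because every modulus is a power of two that divides the modulus of the following stage, reducing the output index directly modulo 2^(i−1) gives the same result as reducing it stage by stage from the last stage backward.

```c
#include <assert.h>

/* Index of the butterfly which must execute at stage `stage`
 * (counted from 1) for the desired output bin `out`:
 * the output index counted modulo 2^(stage-1). */
int butterfly_index(int out, int stage)
{
    assert(out >= 0 && stage >= 1);
    return out % (1 << (stage - 1));
}
```

For outputs 3 and 6 of the sixteen-point example, this gives butterflies 3 and 6 at stage 4, butterflies 3 and 2 at stage 3, butterflies 1 and 0 at stage 2, and butterfly 0 at stage 1, in agreement with FIG. 3 as described.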

A “C” language program for performing preprocessing according to the above aspect of the invention is given by

    void Preprocess(int **indexSet, int N, int s, int *status)
    {
        int i, j, ctr;          // loop counters
        int middleInd, Ind;     // the middle index and end index

        middleInd = N;          // initialization
        for (j = s - 1; j >= 0; j--)        // s-stage FFT, start from the last stage
        {
            Ind = middleInd;                // store end index
            middleInd = Ind / 2;
            // fold the upper half of the flags onto the lower half,
            // which counts each flagged index modulo middleInd
            for (i = middleInd; i < Ind; i++)
            {
                if (status[i])
                    status[i - middleInd] = 1;
            }
            ctr = 0;
            // store the index set for stage j in indexSet
            for (i = 0; i < middleInd; i++)
            {
                if (status[i])
                    indexSet[ctr++][j] = i; // the butterfly index itself is stored
            }
            indexSet[ctr][j] = EndOfList;   // set the tail of indexSet
                                            // (EndOfList is a sentinel defined elsewhere)
        }
    }

A method according to the invention is illustrated in the flow chart or diagram of FIG. 4. In FIG. 4, the logic begins at START block 10, and proceeds to a block 12, which represents the reading of the number of FFT points, which is a number represented by N=2^(S). From block 12, the logic flows to a block 14, representing the reading of the output bin index set O_(S), and to a block 16, representing the reading of the N elements of the data series (the input data). The output bin index set is a representation of the output bins for which the FFT is desired, and the other bins are unwanted information. From block 14, the logic proceeds to a preprocessing step illustrated as a block 18, in which the various indexes are processed by modulo counting, as described in conjunction with FIG. 3. From blocks 16 and 18, the logic flows to a further block 20, which represents traced FFT pruning, to produce the desired FFT data in the selected output bins. From block 20, the logic flows to an END block 22.

FIG. 5 is a simplified logic flow chart or diagram illustrating the logic for implementing preprocessing block 18 of FIG. 4. In FIG. 5, the logic arrives from logic path 15 at a block 218, which represents the generation of the S^(th) stage butterfly index set Ψ_(S−1)

$\begin{matrix}{\Psi_{S - 1} = {O_{S}\;\%\left( \frac{N}{2} \right)}} & (3)\end{matrix}$

where O_(S)%(x) represents the result of operating on O_(S) modulo x. From block 218 of FIG. 5, the logic flows to a further block 220, which represents generation of the l^(th) stage butterfly index Ψ_(l−1)

$\begin{matrix}{\Psi_{l - 1} = {\Psi_{l}\mspace{11mu}\%\left( \frac{N}{2^{S - l + 1}} \right)}} & (4)\end{matrix}$

From block 220, the logic proceeds by way of logic path 19 to block 20 of FIG. 4.

FIG. 6 is a simplified logic flow chart or diagram illustrating the operation of the traced FFT pruning block 20 of FIG. 4. In FIG. 6, the logic flow arrives over logic path 19 at a block 310, which represents the re-indexing of the input data sequence x₀, . . . , x_(N−1) to

$\begin{matrix}{x_{0},\ldots,x_{2^{S} - 1}} & (5)\end{matrix}$

From logic block 310 of FIG. 6, the logic flows to a block 312, which represents the setting of variables nRep and i to nRep=1 and i=0. From logic block 312, the logic flows to a further block 314, representing the setting of the number of butterflies nBF equal to variable nRep. Block 316 represents the resetting of the value of nRep to double its current value, namely nRep=2 nRep. The doubled value of nRep determines the unit angle of the twiddle factor for the current stage. In block 318, the value of θ is set to 2π/nRep. Block 320 represents the setting of m=0.

From block 320 of FIG. 6, the logic flows to a block 322, which represents the setting of α

$\begin{matrix}{\alpha = {\Psi_{[i][m]}\theta}} & (6)\end{matrix}$

where Ψ_([i][m]) represents the m^(th) element of Ψ_(i). From block 322, the logic flows to a block 324, which represents the determination of the twiddle factor TWF=exp[−jα]. From block 324, the logic flows to a block 326, which represents the calculation of k=Ψ_([i][m]). From block 326, the logic flows to a further block 328, which represents the setting of a temporary variable tmp to tmp=x[k+nBF]·TWF. The next block, namely block 330, sets

-   x[k+nBF]=x[k]−tmp
-   x[k]=x[k]+tmp

From block 330, the logic flows to a block 332, which increments the inner or fastest loop index k=k+1. From block 332, the logic proceeds to a decision block 334, which makes the comparison k<N, and if this is true, the logic leaves decision block 334 by the YES output, and proceeds by way of logic path 336 back to block 328, to repeat the butterfly calculation for the next value of k. Eventually, the fastest loop will have calculated all values of k up to N, and the logic will then leave decision block 334 by the NO output, and proceed to a block 338. Block 338 increments the value of running variable m, so that m=m+1. From block 338, the logic flows to a further decision block 340, which examines m. If the current value of m<number (#) of elements in Ψ_(i), the logic leaves decision block 340 by the YES output, and proceeds by way of loopback logic path 342 to block 322. From block 322, the logic proceeds through blocks 324, 326, 328, 330, and 332, recalculating for all values of m up to m=number of elements in Ψ_(i). When m=number of elements in Ψ_(i), the logic leaves decision block 340 by the NO output, and proceeds to a block 344, which represents the incrementing of variable i to i+1. Decision block 346 examines variable i, and returns the logic by way of loopback logic path 348 to block 314 to continue calculation. All the calculations are again performed for the current value of i so long as i<S. Eventually, the value of i will be equal to S, and the logic will then leave decision block 346 by the NO output and proceed to the END block 350, with all the FFT calculations having been made for one set of input data.

The results of the required output are in x[j], where jεO_(S).

The flow chart of FIG. 6 can be implemented in C language as

    #include <complex.h>
    #include <math.h>

    void TFFTP(int N, int s, int **indexSet, float complex *x)
    {
        int i, j, j2, k, m;     // counters
        int nRep;               // index spacing between adjacent butterfly groups
        int numBFL;             // number of butterflies per group
        float twoPi;
        float ang;              // twiddle factor unit phase
        float TWF;              // twiddle factor phase
        float c, sn;            // cosine and sine storage variables
                                // (sn is used because s is already a parameter)
        float complex tempData;

        nRep = 1;
        twoPi = 2 * 3.14159265;
        for (i = 0; i < s; i++)             // number-of-stages loop
        {
            numBFL = nRep;                  // number of butterflies
            nRep = 2 * nRep;
            ang = twoPi / nRep;
            // only the butterflies in the list are calculated
            for (m = 0; indexSet[m][i] != EndOfList; m++)
            {
                // calculate the twiddle factors
                j = indexSet[m][i];
                TWF = ang * j;
                c = cos(TWF);
                sn = sin(TWF);
                // update the data
                for (k = j; k < N; k += nRep)
                {
                    j2 = k + numBFL;
                    tempData = x[j2] * (c + I * sn);
                    x[j2] = x[k] - tempData;
                    x[k] = x[k] + tempData;
                }
            }
        }
    }

FIG. 7 is a simplified block diagram of a conventional four-butterfly pipeline processor for producing an FFT output signal in response to sixteen input signals applied to an input port 710 (location A). These input signals are illustrated in FIG. 8 a as starting at time 0, and are in the form of a stream of numbers designated in FIG. 8 a as 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, and 15, corresponding to the input block of signals for a sixteen-point FFT. The signals are demultiplexed to locations B and C by a switch 712 operating at twice the system clock rate, with the resulting signal streams at locations B and C represented by the B and C signals of FIG. 8 b. The signals at locations B and C of FIG. 7 are delayed by one clock cycle relative to the starting time 0, as a result of operation of the switch 712. The butterfly symbol 714 of FIG. 7 is identified in FIG. 2 b. The division of the input signals of FIG. 8 a into two associated groups designated B and C (at locations B and C of FIG. 7) corresponds to the grouping of input signals 0 through 15 in FIG. 1 into pairs for application to the butterflies of stage 1 of FIG. 1. More particularly, in FIG. 8 b, the B,C pairs are pairs 0,1; 2,3; 4,5; 6,7; 8,9; 10,11; 12,13; and 14,15, occurring sequentially, rather than in parallel. Physically, there is but a single input-stage butterfly in FIG. 7, which operates at the system clock rate. At the first local clock cycle, butterfly 714 of FIG. 7 processes input signals 0,1; at the second clock cycle, it processes input signals 2,3; at the third clock cycle, it processes input signals 4,5, and so forth, taking eight local system clock cycles to process all of the sixteen input signals of FIG. 8 a. The results of the butterfly operation by butterfly processor 714 appear at locations D and E of FIG. 7, as indicated in FIG. 8 c. The indicated “start time” of “3” of FIGS. 8 c and 8 d assumes that there is a two-clock-cycle delay in traversing from location C to location D of butterfly 714 of FIG. 7. The lower branch output signal from butterfly 714 is delayed by one clock cycle in a delay (Z⁻¹) element 716. The signals at locations F and G of FIG. 7, then, are represented by FIG. 8 d. In FIG. 8 d, the upper branch is not illustrated as being delayed, but the lower branch is illustrated as delayed by one local clock cycle; that is to say, the numbers at the F location appear one clock cycle prior to (to the left of) those at G in FIG. 8 d. The signals at locations F and G are applied to a switch illustrated as 718 in FIG. 7, which operates at half the local clock rate. Switch 718 has two states, namely straight-through coupling from F to H and from G to I, and criss-cross coupling from F to I and from G to H. In FIG. 8 e, the uppermost or logic-1 level of the half-local-clock-rate switch state represents straight-through operation of switch 718, and the logic-0 level of the signal of FIG. 8 e represents criss-cross operation.

The operation of switch 718 of FIG. 7 results in the coupling of signal 0 from F of FIG. 8 d to H of FIG. 8 f during the first switch clock logic high state of FIG. 8 e; the coupling of signal 2 from F of FIG. 8 d to I of FIG. 8 f and of signal 1 from G of FIG. 8 d to H of FIG. 8 f during the second switch clock cycle; the coupling of signals 3 and 4 from locations G and F, respectively, to locations I and H, respectively, during the third switch clock cycle; the coupling of signals 6 and 5 from locations F and G, respectively, to locations I and H, respectively; the coupling of signals 7 and 8 from locations G and F, respectively, to locations H and I, respectively; the coupling of signals 9 and 10 from locations G and F, respectively, to locations H and I, respectively; the coupling of signals 11 and 12 from locations G and F, respectively, to locations I and H, respectively; the coupling of signals 13 and 14 from locations G and F, respectively, to locations H and I, respectively; and the coupling of signal 15 from location G to location I, during subsequent clock cycles of FIG. 8 e, as illustrated in FIGS. 8 d, 8 e, and 8 f.

The signals at location H at the upper output of switch 718 of FIG. 7 are coupled to a location H′ at an input of a further pipeline butterfly 720 by way of a one-local-clock delay element 722, and the signals at location I are coupled to the other input port of pipeline butterfly 720 without delay. It will be noted that there is a one-clock delay 716 between locations G and E, and another between locations H and H′, so the delays tend to “cancel” to thereby bring signals simultaneously applied to locations B and C of FIG. 7 into time alignment at locations H′ and I. The time-aligned signals are applied to butterfly processor 720 of FIG. 7, to produce processed signals at locations J and K, as illustrated in FIG. 8 g. Locations J and K of FIG. 7 are delayed by two local clock cycles relative to locations H′ and I. Referring to FIG. 8 g, the starting time is indicated as being the 6th clock cycle.

In FIG. 7, a two-clock-cycle (Z⁻²) delay 724 is interposed between locations K and M, and no further delay is placed between locations J and L. Consequently, a net two-clock delay is introduced, which is suggested by the start time of “8” in FIG. 8 h. More particularly, the signals at location L are equated to those at J, and the signals at M are delayed by two clock periods relative to those at location K. In FIG. 7, the signals at locations L and M are applied to a criss-cross switch 726, which is controlled by the signal of FIG. 8 i in the same manner as switch 718 is controlled by the signal of FIG. 8 e, but at a rate equal to ¼ the local clock rate. This criss-cross switching results in the coupling of signals to locations O and P as illustrated in FIG. 8 j. More particularly, during the first half-cycle of the switch control clock of FIG. 8 i, signals 0 and 1 at location L are coupled to location O. During the second half-cycle of the control signal of FIG. 8 i, signals 4 and 5 at location L are coupled to P, and signals 2 and 3 at location M are coupled to location O. During the third half-cycle of the control signal of FIG. 8 i, signals 6,7 at location M are coupled to P and signals 8,9 at location L are coupled to O. During the fourth half-cycle of the switch 726 control signal of FIG. 8 i, signals 10,11 at location M are coupled to O, and signals 12,13 at location L are coupled to P.

In FIG. 7, the signal at location O is coupled to location O′ by way of a further two-local-system-clock-cycle delay (Z⁻²) designated 728. No delay is interposed in the path associated with location P. As a result, the signals arriving at the input nodes or ports of butterfly processor 730 have no relative delay. Again, butterfly processor 730 is assumed to have a two-local-system-clock delay, which introduces no relative delay between the two paths. Consequently, the signals arriving at butterfly output locations Q and R of FIG. 7 are as illustrated in FIG. 8 k. The signal at location R of FIG. 7 is coupled to location T by way of a four-cycle (Z⁻⁴) delay 732, with the result that the signal arriving at location S of FIG. 7 is advanced relative to the signal arriving at location T by four clock cycles, as illustrated in FIG. 8 l. The indicated start time in FIG. 8 l is “12.” The signals at locations S and T are applied to a criss-cross switch 734, which operates under the control of the control signal illustrated in FIG. 8 m to couple the signals 0, 1, 2, 3 from location S to location U, signals 8, 9, 10, and 11 from location S to location V, signals 4, 5, 6, and 7 from location T to location U, and signals 12, 13, 14, and 15 from location T to location V, as illustrated in FIGS. 8 l, 8 m, and 8 n. A further four-clock-cycle delay element 736 delays the U signal proceeding to the input U′ of butterfly processor 738, to thereby bring the signals applied to butterfly processor 738 into temporal alignment, so that signal sets 0,8; 1,9; 2,10; 3,11; 4,12; 5,13; 6,14; and 7,15 are temporally aligned for application to the input ports of butterfly processor 738. Finally, butterfly processor 738 processes the fourth stage of the FFT and produces the signal set of FIG. 8 o at its outputs W and X.
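The operand pairings named in the text (0,1; 2,3; and so on at the first butterfly, and 0,8; 1,9; and so on at the last) follow the same indexing as the inner loops of the TFFTP listing. The sketch below enumerates the pairs for one stage. It is an illustration under that assumption only: the function name stagePairs is hypothetical, and the pairs are listed in index order, not in the order in which the pipeline of FIG. 7 presents them in time.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical helper: operand pairs of one radix-2 stage whose two
// butterfly inputs are `span` positions apart (span = 1, 2, 4, ..., N/2).
std::vector<std::pair<int, int>> stagePairs(int N, int span)
{
    std::vector<std::pair<int, int>> pairs;
    for (int group = 0; group < N; group += 2 * span)   // one group of butterflies
        for (int j = 0; j < span; ++j)                   // butterfly index within the group
            pairs.push_back({group + j, group + j + span});
    return pairs;
}
```

With N = 16, a span of 1 reproduces the first-stage pairs and a span of 8 reproduces the pairs applied to butterfly processor 738.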

In general, control of a particular stage of the arrangement of FIG. 7 is based upon an index Ψ_x, where x represents the next-higher stage of butterflies of FIG. 3. Thus, control of the first-stage butterfly 714 of FIG. 7 by controller 754 uses the second-stage butterfly index Ψ₁ described in conjunction with FIG. 5, control of the second-stage butterfly 720 of FIG. 7 by controller 750 uses the third-stage butterfly index Ψ₂, and control of the third-stage butterfly 730 of FIG. 7 by controller 760 uses the fourth- or last-stage butterfly index Ψ₃. The last-stage pipeline butterfly of FIG. 7, namely butterfly 738, is controlled by controller 768 using the selected output bin index O_S, which in the case of the four-butterfly pipeline of FIG. 7 is O₄.

In FIG. 7, blocks 754, 750, 760, and 768 represent controllers for controlling the operation of pipeline butterfly stages 714, 720, 730, and 738, respectively, in accordance with an aspect of the invention. FIG. 9 is a simplified diagram in block and schematic form illustrating the jth stage of DIF butterfly and its control arrangement. First-stage controller 754 contains two one-bit control memories $m_1^{0}$ and $m_1^{1}$, where the subscript refers to the stage number, the superscript 0 represents control of the adder in the associated butterfly, and the superscript 1 represents control of the subtractor. Similarly, controller 750 controlling the second-stage pipeline butterfly 720 contains four one-bit memories $m_2^{0}$, $m_2^{1}$, $m_2^{2}$, $m_2^{3}$, which control the adders, subtractors, and multipliers of the butterfly of the second stage. Controller 760 controlling the third-stage pipeline butterfly 730 contains eight one-bit memories $m_3^{0}$, $m_3^{1}$, $m_3^{2}$, $m_3^{3}$, $m_3^{4}$, $m_3^{5}$, $m_3^{6}$, $m_3^{7}$, designated together as $m_3^{x}$, which control the adders, subtractors, and multipliers of the butterfly of the third stage, and controller 768 controlling the fourth-stage pipeline butterfly 738 contains sixteen one-bit memories $m_4^{0}$, $m_4^{1}$, $m_4^{2}$, $m_4^{3}$, $m_4^{4}$, $m_4^{5}$, $m_4^{6}$, $m_4^{7}$, $m_4^{8}$, $m_4^{9}$, $m_4^{10}$, $m_4^{11}$, $m_4^{12}$, $m_4^{13}$, $m_4^{14}$, $m_4^{15}$, designated jointly as $m_4^{x}$, which control the adders, subtractors, and multipliers of the butterfly of the fourth stage.

The values contained in the memories may be fixed during computations if the output bin set is defined and remains unchanged over time. The values contained in the memories may require updating if the output bin set changes from time to time.

In general, the one-bit memories of controllers 754, 750, 760, and 768 of FIG. 7 are designated $m_j^{i}$, where the subscript j is the stage number and the superscript i is the memory member. In general, the memory controls the subtractor when
$2^{j-1} \le i \le 2^{j}-1$,
the memory controls the adder when
$0 \le i \le 2^{j-1}-1$,
and the Boolean sum of the signals or bits stored in the memory pair
$m_j^{i},\; m_j^{i+2^{j-1}}$   (7)
controls the multiplier, where $0 \le i \le 2^{j-1}-1$.
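The index ranges above can be restated as two small predicates. This is a sketch under the assumption that stage j (numbered from 1) holds 2^j one-bit memories stored in member order; the names Element, controlledElement, and multiplierEnabled are hypothetical and do not appear in the original.

```cpp
#include <cassert>
#include <vector>

enum class Element { Adder, Subtractor };

// Which butterfly element memory member i of stage j controls (j counted from 1).
Element controlledElement(int j, int i)
{
    int Y = 1 << (j - 1);                       // Y = 2^(j-1)
    return (i < Y) ? Element::Adder : Element::Subtractor;
}

// Multiplier enable for pair i (0 <= i < Y): Boolean sum (OR) of the paired bits.
bool multiplierEnabled(const std::vector<bool>& m, int j, int i)
{
    int Y = 1 << (j - 1);
    return m[i] || m[i + Y];
}
```

For a second-stage memory holding the bits 0, 0, 1, 0, member 2 is a subtractor bit, so the multiplier is enabled for pair 0 and idle for pair 1.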

The following table represents the translation between Ψ₁ and $m_1^{0}$, $m_1^{1}$, meaning that it relates to the application of the second-stage butterfly index set to the first-stage control memory.

Ψ₁           $m_1^{0}$   $m_1^{1}$
null         0           0
{0}          1           0
{1}          0           1
{0,1}        1           1

If the bracketed index set {} contains butterfly index k, then $m_1^{k}=1$; otherwise $m_1^{k}=0$.

FIGS. 9 a and 9 b are simplified block diagrams of a system for generating control signals for the various butterfly processors of FIG. 7, so as to cause the pruned or reduced-processing operation according to an aspect of the invention. More particularly, FIG. 9 a represents a system for controlling a DIT-type processor, and FIG. 9 b represents a system for controlling a DIF-type processor.

In FIG. 9 a, the butterfly nodes are designated as 910, 912, 914, and 916. The signal applied to input node or port 912 is multiplied by a weighting factor $W^{P}$ in a multiplier 920. An adder 918 is coupled to input node 910 and to the output port of multiplier 920, for adding together the signals therefrom, under the control of the contents from $m_j^{i}$ memory 922. A subtractor 928 is coupled to receive signals from input node 910 and from the output of multiplier 920, for subtracting the two signals under the control of $m_j^{Y}$ memory 930, where $Y = i + 2^{j-1}$. The weighting multiplication performed in multiplier 920 is controlled by the output of a Boolean summing circuit 932, which receives as its input signals $m_j^{i}$ and $m_j^{Y}$. One bit controls multiplier 920 to the active or idle state (hold overbar). In the active state, the input signal from port 912 is multiplied by the specified weight, and in the idle state, the multiplier simply holds its previous value. This previous value is not used, so it may be considered to be garbage. More particularly, if the one-bit memory signals produced by memories 922 and 930 of FIG. 9 a are both 0, their sum is 0, and the multiplier assumes its idle state. If either or both of the one-bit memory output signals are 1, their sum is considered to be 1, and multiplier 920 assumes its active state. Similarly, adder 918 and subtractor 928 are active when their control signals are logic high, and inactive or idle when their control signals are low.
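The gating described for FIG. 9 a can be modeled in a few lines. This is an illustrative software sketch, not the hardware itself: the type and function names are hypothetical, and an idle element is modeled as leaving its output at the previous (unused) value.

```cpp
#include <cassert>
#include <complex>

using cplx = std::complex<double>;

struct ButterflyOut { cplx upper, lower; };     // outputs at nodes 914 and 916

// DIT butterfly with one-bit enables: addEn plays the role of the adder bit,
// subEn the role of the paired subtractor bit.
ButterflyOut ditButterfly(cplx a, cplx b, cplx w,
                          bool addEn, bool subEn, ButterflyOut prev)
{
    ButterflyOut out = prev;                    // idle elements hold their previous value
    if (addEn || subEn) {                       // Boolean sum gates the multiplier
        cplx wb = w * b;                        // weighting multiplication
        if (addEn) out.upper = a + wb;          // adder active
        if (subEn) out.lower = a - wb;          // subtractor active
    }
    return out;
}
```

When both bits are 0, the function performs no multiplication and no addition, which is the saving the pruned operation is intended to provide.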

The DIF butterfly of FIG. 9 b includes elements corresponding to those of FIG. 9 a, and these elements are designated by the same reference numerals. In FIG. 9 b, the signals applied to input ports 910 and 912 are applied to adder 918 and to subtractor 928. The output signal of adder 918 is coupled directly to output port 914, and the output signal from subtractor 928 is applied to a weighting multiplier 920. The multiplied output signal from multiplier 920 is applied to output port 916. Summing circuit 918 is controlled by the $m_j^{i}$ signal from a memory 950, and subtracting circuit 928 is controlled by the $m_j^{i+2^{j-1}}$ signal from a memory 952. The memory outputs are also applied to an adding circuit or adder 932, the output of which controls the weighting multiplier 920.
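A corresponding software sketch of the DIF arrangement of FIG. 9 b follows. The names are hypothetical. The text gates multiplier 920 with the Boolean sum of both memory bits; this sketch makes a simplifying assumption and forms the product only when the subtractor bit is set, since the product of a held subtractor output would not be used.

```cpp
#include <cassert>
#include <complex>

using cplx = std::complex<double>;

struct DifOut { cplx upper, lower; };           // outputs at ports 914 and 916

// DIF butterfly: the sum goes straight to the upper output, the difference
// is weighted by w before reaching the lower output.
DifOut difButterfly(cplx a, cplx b, cplx w,
                    bool addEn, bool subEn, DifOut prev)
{
    DifOut out = prev;                          // idle elements hold their previous value
    if (addEn) out.upper = a + b;               // adder active
    if (subEn) out.lower = (a - b) * w;         // subtractor, then weighting multiplier
    return out;
}
```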

The timing of the controls of FIGS. 9 a and 9 b must take into account that the pipeline processor with which they are to be used has j stages, as indicated by the j subscripts of the memory indices. The jth stage control block (including memories 922, 930, and summing circuit 932) counts the local system clock by $2^{j-1}$. During the first clock cycle, the contents from $m_j^{0}$ 922 and its paired element $m_j^{Y}$ 930, where $Y = 2^{j-1}$, are loaded into the two memories 922 and 930.

The arrangement of FIG. 10 is a simplified representation of the jth stage of DIT butterfly, including details of the control. In FIG. 10, elements corresponding to those of FIG. 9 a are designated by like reference numerals. In FIG. 10, control of summing circuit 918 is provided by a buffer designated 1022, and control of subtracting circuit 928 is provided by a buffer designated 1024. Buffers 1022 and 1024 receive their input signals from a memory designated generally as 1010, which in general produces two outputs at a time, namely those applied to buffers 1022 and 1024 from memory output ports 1010 a and 1010 b. The output signal produced by memory 1010 at its output ports 1010 a and 1010 b is controlled by a pointer, illustrated as 1010 p, which at any given time points to or addresses one pair of memory locations, so as to select the signals stored in those memory locations for coupling to the output ports. The pointer is controlled by a simple counter, which counts the local clock by $2^{j-1}$ in a periodic fashion. At time or clock cycle 0, the counter-controlled pointer points to memory addresses $m_j^{0}$ and $m_j^{Y}$, where $Y = 2^{j-1}$. At time 1, the pointer points to $m_j^{1}$ and $m_j^{1+Y}$, again where $Y = 2^{j-1}$. At a later time i, the pointer 1010 p points to $m_j^{i}$ and $m_j^{i+Y}$. Finally, just before the count turns over, the pointer 1010 p points to the memory addresses represented by $m_j^{Y-1}$ and $m_j^{2Y-1}$. This control provides the proper timing for pruned operation in accordance with an aspect of the invention.

The arrangement of FIG. 11 is a simplified representation of the jth stage of DIF butterfly, including details of the control. In FIG. 11, elements corresponding to those of FIG. 9 b are designated by like reference numerals. In FIG. 11, control of summing circuit 918 is provided by the buffer designated 1022, and control of subtracting circuit 928 is provided by the buffer designated 1024. Buffers 1022 and 1024 receive their input signals from a memory designated generally as 1110, which in general produces two outputs at a time, namely those applied to buffers 1022 and 1024 from memory output ports 1110 a and 1110 b. The output signal produced by memory 1110 at its output ports 1110 a and 1110 b is controlled by a pointer, illustrated as 1110 p, which at any given time points to or addresses one pair of memory locations, so as to select the signals stored in those memory locations for coupling to the output ports. The pointer 1110 p is controlled by a simple counter, which counts the local clock by $2^{j-1}$. At time or clock cycle 0, the counter-controlled pointer 1110 p points to memory addresses $m_j^{0}$ and $m_j^{Y}$, where $Y = 2^{j-1}$. At time 1, the pointer points to $m_j^{1}$ and $m_j^{1+Y}$, again where $Y = 2^{j-1}$. At a later time i, the pointer 1110 p points to $m_j^{i}$ and $m_j^{i+Y}$. Finally, just before the count turns over, the pointer 1110 p points to the memory addresses represented by $m_j^{Y-1}$ and $m_j^{2Y-1}$. This control provides the proper timing for pruned operation in accordance with an aspect of the invention.
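The counter-driven pointer described for FIGS. 10 and 11 can be sketched as a function of the clock count. This assumes the counter simply wraps modulo Y = 2^(j−1), as the phrase "just before the count turns over" suggests; the function name pointerPair is hypothetical.

```cpp
#include <cassert>
#include <utility>

// Memory members addressed by the pointer of stage j (counted from 1)
// at a given local clock count: the pair (i, i + Y) with i = clock mod Y.
std::pair<int, int> pointerPair(int j, long clock)
{
    int Y = 1 << (j - 1);                       // Y = 2^(j-1), the counter period
    int i = static_cast<int>(clock % Y);
    return {i, i + Y};
}
```

For the third stage (Y = 4), clock counts 0 through 3 address the pairs (0,4), (1,5), (2,6), and (3,7), after which the sequence repeats.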

Mapping from the (j+1)th stage butterfly index set Ψ_j to the jth stage memory bits $m_j^{i}$ ($1 \le j \le S-1$, $0 \le i \le 2^{j-1}-1$) is determined by

    (a) for $1 \le j \le S-1$ (for stage j):
        (i) if $k \in \Psi_j$, that is, if Ψ_j contains index k, then $m_j^{k}=1$;
        (ii) if $k \notin \Psi_j$, then $m_j^{k}=0$;
    (b) for $j = S$ (for stage S, the last stage):
        (i) if $k \in O_S$, that is, if O_S contains index k, then $m_j^{k}=1$;
        (ii) if $k \notin O_S$, then $m_j^{k}=0$.

Control of the memory pair is determined at stage j by $m_j^{i}$ ($0 \le i \le 2^{j-1}-1$) and $m_j^{i+Y}$ ($Y = 2^{j-1}$). When $0 \le i \le 2^{j-1}-1$, $m_j^{i}$ controls the butterfly adder, and its paired memory element $m_j^{i+Y}$ controls the butterfly subtractor. The butterfly multiplier is controlled in accordance with the Boolean OR of $m_j^{i}$ and $m_j^{i+Y}$,
$m_j^{i} \oplus m_j^{i+2^{j-1}}$   (8)
where ⊕ here denotes the Boolean OR.
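Rules (a) and (b) amount to setting one bit per index contained in the governing set. A minimal sketch, assuming that stage j holds 2^j one-bit memories (two, four, eight, and sixteen for the four controllers of FIG. 7) and that the caller passes Ψ_j for stages 1 through S−1 and O_S for the last stage; the function name memoryBits is hypothetical.

```cpp
#include <cassert>
#include <set>
#include <vector>

// Control memory of stage j (counted from 1): bit k is 1 exactly when the
// governing index set contains k, and 0 otherwise.
std::vector<bool> memoryBits(int j, const std::set<int>& indexSet)
{
    int size = 1 << j;                          // 2^j one-bit memories
    std::vector<bool> m(size, false);
    for (int k : indexSet)
        if (k >= 0 && k < size)                 // ignore indices outside this stage
            m[k] = true;
    return m;
}
```

Called with j = 1, the sketch reproduces the four rows of the first-stage translation table: the null set gives 0, 0; {0} gives 1, 0; {1} gives 0, 1; and {0,1} gives 1, 1.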

Timing for control of the loading of the memory contents for $m_j^{i}$ and $m_j^{i+Y}$ at the jth stage butterfly is determined by counting the system clock at the jth stage by $2^{j-1}$. At the first system clock, the contents of memories $m_j^{0}$ and $m_j^{Y}$ are loaded to control the butterfly. At the second system clock, the contents of memories $m_j^{1}$ and $m_j^{1+Y}$ are loaded to control the butterfly. This process continues from clock cycle to clock cycle, until, at the final clock cycle of the count, the contents of memories $m_j^{Y-1}$ and $m_j^{2Y-1}$ are loaded to control the butterfly. The process repeats by loading the contents of memories $m_j^{0}$ and $m_j^{Y}$ for the next system clock, and so on.

Other embodiments of the invention will be apparent to those skilled in the art. For example, the digital data may be in serial or parallel form. The algorithm can also be applied to parallel pipeline processing. The algorithm, with minor modification, can be applied to non-radix-2 applications, such as radix 4 and the prime-number radix FFT.

1. A computer readable medium including a program having instructions, which when executed perform a radix 2 fast Fourier transform on a digital series to produce signals in cyclically noncontinuous output bins that are useable in demultiplexing carrier signals, the instructions comprising: determining the number $2^{S}$ of FFT points, the output bin index $O_S$, and the input signal array; determining the butterfly index for the last stage by $\Psi_{S-1} = O_{S} \,\%\, \left(\frac{N}{2}\right)$; determining the butterfly index for each stage other than said last stage by $\Psi_{l-1} = \Psi_{l} \,\%\, \left(\frac{N}{2^{S-l+1}}\right)$, where % means modulo operation and where l varies from 1 to (S−1); using said butterfly index, calculating only those butterflies necessary for calculation of the output bins, providing to storage or further processing said signals in cyclically noncontinuous output bins, and applying said signals from said cyclically noncontinuous output bins to demultiplexing at least one carrier.
 2. The computer readable medium according to claim 1, wherein said determining the butterfly index for all later stages is performed in numerical order.
 3. The computer readable medium according to claim 2, wherein said numerical order is ascending order.
 4. The computer readable medium according to claim 1, further including the determination of output bins, wherein: for stage l, where l varies from 1 to S, executing only that butterfly in the butterfly index set Ψ_{l−1} of that stage; for stage l, loading the twiddle factor corresponding to the butterfly index set Ψ_{l−1} of that stage; and repeating (a) executing only that butterfly in the butterfly index set Ψ_{l−1} of that stage and (b) loading the twiddle factor corresponding to the butterfly index set Ψ_{l−1} of that stage, until the required final stage butterflies are executed and the required output bins are filled.
 5. The computer readable medium according to claim 4, wherein setting the butterfly index includes, when $0 \le i \le 2^{j-1}-1$: controlling the butterfly adder with $m_j^{i}$; controlling the butterfly subtractor with $m_j^{i+Y}$; and controlling the butterfly multiplier in accordance with the Boolean OR of $m_j^{i}$ and $m_j^{i+Y}$.
 6. The computer readable medium according to claim 1, wherein using said butterfly index further comprises: setting the butterfly index set Ψ_j where ($1 \le j \le S-1$) and the selected output node index set ranges from $O_S$ to $m_S^{i}$ by (a) for ($1 \le j \le S-1$) (i) if ($k \in \Psi_j$), or Ψ_j contains index k, then setting $m_j^{k}=1$, (ii) if ($k \notin \Psi_j$), then setting $m_j^{k}=0$; (b) for j=S (i) if ($k \in O_S$), or $O_S$ contains index k, then setting $m_j^{k}=1$, (ii) if ($k \notin O_S$), then setting $m_j^{k}=0$; and controlling of a memory pair at stage j by $m_j^{i}$ ($0 \le i \le 2^{j-1}-1$) and $m_j^{i+Y}$ ($Y = 2^{j-1}$).
 7. A method for producing signals in cyclically noncontinuous output bins that are useable in demultiplexing carrier waves, the method comprising: determining the number $2^{S}$ of FFT points, the output bin index $O_S$, and the input signal array; determining the butterfly index for the last stage by $\Psi_{S-1} = O_{S} \,\%\, \left(\frac{N}{2}\right)$; determining the butterfly index for each stage other than said last stage by $\Psi_{l-1} = \Psi_{l} \,\%\, \left(\frac{N}{2^{S-l+1}}\right)$, where % means modulo operation and where l varies from 1 to (S−1); using said butterfly index, calculating only those butterflies necessary for calculation of the output bins, providing to storage or further processing said signals in cyclically noncontinuous output bins, and applying said signals from said cyclically noncontinuous output bins to demultiplexing at least one carrier wave.
 8. The method according to claim 7, wherein said determining the butterfly index for all later stages is performed in numerical order.
 9. The method according to claim 8, wherein said numerical order is ascending order.
 10. The method according to claim 7, further including the determination of output bins, wherein: for stage l, where l varies from 1 to S, executing only that butterfly in the butterfly index set Ψ_{l−1} of that stage; for stage l, loading the twiddle factor corresponding to the butterfly index set Ψ_{l−1} of that stage; and repeating (a) executing only that butterfly in the butterfly index set Ψ_{l−1} of that stage and (b) loading the twiddle factor corresponding to the butterfly index set Ψ_{l−1} of that stage, until the required final stage butterflies are executed and the required output bins are filled.